Search results for "XML Catalog"
showing 5 items of 5 documents
A novel XML document structure comparison framework based-on sub-tree commonalities and label semantics
2012
XML similarity evaluation has become a central issue in the database and information communities, its applications ranging over document clustering, version control, data integration and ranked retrieval. Various algorithms for comparing hierarchically structured data, XML documents in particular, have been proposed in the literature. Most of them make use of techniques for finding the edit distance between tree structures, XML documents being commonly modeled as Ordered Labeled Trees. Yet, a thorough investigation of current approaches led us to identify several similarity aspects, i.e., sub-tree related structural and semantic similarities, which are not sufficient…
A Life Cycle Model of XML Documents
2014
Electronic documents produced in business processes are valuable information resources for organizations. In many cases they have to be accessible long after the life of the business processes or information systems in connection with which they were created. To improve the management and preservation of documents, organizations are deploying Extensible Markup Language (XML) as a standardized format for documents. The goal of this paper is to increase understanding of XML document management and provide a framework to enable the analysis and description of the management of XML documents throughout their life. We followed the design science approach. We introduce a document life cycle model…
Aspects on XML Document Content Reuse in Organizations
2007
Designing the reuse of information residing in documents is more complex than for information in databases. Document content is designed for humans and organized with regard to communicational purposes for organizational work. In addition, content organization within documents is affected by the requirements of multichannel publishing and layout design for content presentation. Efficient content reuse in organizational documents requires that the ways the content is created and stored within and across documents and other content resources, such as databases, should be identified. XML provides technological means for document content reuse. The designers of XML document production need to b…
An overview on XML similarity: Background, current trends and future directions
2009
In recent years, XML has been established as a major means for information management, and has been broadly utilized for complex data representation (e.g. multimedia objects). Owing to the unparalleled growth in the use of the XML standard, developing efficient techniques for comparing XML-based documents has become essential in the database and information retrieval communities. In this paper, we provide an overview of XML similarity/comparison by presenting existing research related to XML similarity. We also detail the possible applications of XML comparison processes in various fields, ranging over data warehousing, data integration, classification/clustering and XML querying, and discuss some…
Extensible User-Based XML Grammar Matching
2009
XML grammar matching has attracted considerable interest recently due to the growing number of heterogeneous XML documents on the web and the increasing need to integrate, and consequently search and retrieve, XML data originating from different data sources. In this paper, we provide an approach for automatic XML grammar matching and comparison aiming to minimize the amount of user effort required to perform the match task. We propose an open framework based on the concept of tree edit distance, integrating different matching criteria so as to capture XML grammar element semantic and syntactic similarities, cardinality and alternativeness constraints, as well as data-ty…